Action Recognition with Stacked Fisher Vectors
نویسندگان
چکیده
Representation of video is a vital problem in action recognition. This paper proposes Stacked Fisher Vectors (SFV), a new representation with multi-layer nested Fisher vector encoding, for action recognition. In the first layer, we densely sample large subvolumes from input videos, extract local features, and encode them using Fisher vectors (FVs). The second layer compresses the FVs of subvolumes obtained in previous layer, and then encodes them again with Fisher vectors. Compared with standard FV, SFV allows refining the representation and abstracting semantic information in a hierarchical way. Compared with recent mid-level based action representations, SFV need not to mine discriminative action parts but can preserve mid-level information through Fisher vector encoding in higher layer. We evaluate the proposed methods on three challenging datasets, namely Youtube, J-HMDB, and HMDB51. Experimental results demonstrate the effectiveness of SFV, and the combination of the traditional FV and SFV outperforms state-of-the-art methods on these datasets with a large margin.
منابع مشابه
Hyper-Fisher Vectors for Action Recognition
In this paper, a novel encoding scheme combining Fisher vector and bag-of-words encodings has been proposed for recognizing action in videos. The proposed Hyper-Fisher vector encoding is sum of local Fisher vectors which are computed based on the traditional Bag-of-Words (BoW) encoding. Thus, the proposed encoding is simple and yet an effective representation over the traditional Fisher Vector ...
متن کاملRNN Fisher Vectors for Action Recognition and Image Annotation
Recurrent Neural Networks (RNNs) have had considerable success in classifying and predicting sequences. We demonstrate that RNNs can be effectively used in order to encode sequences and provide effective representations. The methodology we use is based on Fisher Vectors, where the RNNs are the generative probabilistic models and the partial derivatives are computed using backpropagation. State ...
متن کاملHistogram of Oriented Depth Gradients for Action Recognition
In this paper, we report on experiments with the use of local measures for depth motion for visual action recognition from MPEG encoded RGBD video sequences. We show that such measures can be combined with local space-time video descriptors for appearance to provide a computationally efficient method for recognition of actions. Fisher vectors are used for encoding and concatenating a depth desc...
متن کاملComparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions
We present a comparative evaluation of various techniques for action recognition while keeping as many variables as possible controlled. We employ two categories of Riemannian manifolds: symmetric positive definite matrices and linear subspaces. For both categories we use their corresponding nearest neighbour classifiers, kernels, and recent kernelised sparse representations. We compare against...
متن کاملRapid and Brief Communication Alternative linear discriminant classi$er
Fisher linear discriminant analysis (FLDA) $nds a set of optimal discriminating vectors by maximizing Fisher criterion, i.e., the ratio of the between scatter to the within scatter. One of its major disadvantages is that the number of its discriminating vectors capable to be found is bounded from above by C-1 for C-class problem. In this paper for binary-class problem, we propose alternative FL...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014